Abstract — Most coding theorems in quantum Shannon theory 
can be proven using the decoupling technique: to send data 
through a channel, one guarantees that the environment gets 
no information about it; Uhlmann's theorem then ensures that 
the receiver must be able to decode. While a wide range 
of problems can be solved this way, one of the most basic 
coding problems remains impervious to a direct application of 
this method: sending classical information through a quantum 
channel. We will show that this problem can, in fact, be solved 
using decoupling ideas, specifically by proving a "dequantizing" 
theorem, which ensures that the environment is only classically 
correlated with the sent data. 

Our techniques naturally yield a generalization of the Holevo- 
Schumacher-Westmoreland Theorem to the one-shot scenario, 
where a quantum channel can be applied only once. 

Index Terms — Coding, Decoupling, HSW Theorem, Smooth 
entropies. 



I. Introduction 

One of the most fruitful ideas that arose in quantum 
Shannon theory in the past few years is that of decoupling: the 
fact that, in quantum mechanics, the absence of correlations 
between two systems implies perfect correlations of those 
two systems with a third one. More precisely, the core idea
is as follows: suppose that we have a tripartite pure state
$|\rho\rangle_{ABC}$ and that we know that the reduced state on AB
is a product state, i.e. $\mathrm{tr}_C[|\rho\rangle\langle\rho|] = \rho_A \otimes \rho_B$. Then, we
know from the unitary equivalence of purifications that there
exists a partial isometry $V_{C \to C_A C_B}$ with the property that
$V|\rho\rangle = |\psi\rangle_{A C_A} \otimes |\varphi\rangle_{B C_B}$. In other words, if A and B are
completely uncorrelated, then C contains perfect correlations
with both A and B. Furthermore, this observation remains true
if the state on A and B is only close to a product state, as can
be shown via Uhlmann's theorem [23].

This observation can be used to prove coding theorems 
for quantum Shannon theory problems. To see this, suppose
that we have a channel $\mathcal{T}_{A \to B}$, with a Stinespring dilation
$(U_{\mathcal{T}})_{A \to BE}$, and that we want to use this channel to send
quantum information from Alice (who has access to the input
A) to Bob (who receives the output system B). Let $\psi_M$ (with
purification $|\psi\rangle_{MR}$) be the state of the message Alice wants
to send, and let $W_{M \to A}$ be the encoding isometry she uses
to map her state to the channel input. After encoding the
state and sending it through the channel, we have $\varphi_{RBE} :=
U_{\mathcal{T}} W \psi W^\dagger U_{\mathcal{T}}^\dagger$. Now, suppose that the encoding operation is
such that $\varphi_{RE} = \psi_R \otimes \varphi_E$. Then, the argument in the previous
paragraph tells us that there exists an isometry $V_{B \to M E'}$ such
that $V \varphi_{RBE} V^\dagger = \psi_{MR} \otimes \xi_{EE'}$ for some state $\xi$. If we then
trace out EE', we see that V acted as a decoder to recover
the initial state $\psi_{MR}$. One can also show that the condition
that R and E be decoupled is not only sufficient but necessary 
in order to be able to transmit arbitrary quantum information. 
This simplifies our task as information theorists: as long as 
we can design an encoder W that ensures that this decoupling 
condition is fulfilled, we know that a decoder must exist, and 
do not need to explicitly construct it. Furthermore, our aim 
becomes to destroy correlations rather than to ensure their 
presence, which seems to be a rather less delicate task at first 
glance. 

To enforce the decoupling condition, a number of decoupling
theorems have arisen [12], [1], [7], [8]. The version
from [7], [8], whose approach we will broadly follow here,
goes as follows. Let $\bar{\mathcal{T}}_{A \to E}$ be a complementary channel for
$\mathcal{T}_{A \to B}$ and let $\rho_{AR}$ be a quantum state. We consider the
state $(\bar{\mathcal{T}} \otimes \mathcal{I}_R)\big((U_A \otimes \mathbb{1}_R)\rho_{AR}(U_A \otimes \mathbb{1}_R)^\dagger\big)$ on ER, where
$U_A$ is chosen randomly according to the Haar measure on
the unitary group on A. It turns out that this state is decoupled
(i.e. that it is close to $\bar{\mathcal{T}}(\mathbb{1}_A/d_A) \otimes \rho_R$ in trace distance) if the state and
the channel fulfill a certain entropic criterion, namely that
$H^\varepsilon_{\min}(A|R)_\rho + H^\varepsilon_{\min}(A'|E)_{\bar{\mathcal{T}}} \gtrsim 0$ (these smooth min-entropies
will be defined in the next section). Roughly speaking, the
first term measures how hard the state par is to decouple, 
and the second term measures the "decoupling power" of the 
channel T; if the decoupling power of the channel exceeds 
the difficulty of decoupling the state, decoupling does indeed 
happen. 

By appropriately applying the outlined procedure, one can 
get a variety of coding theorems. This general approach has 
now become a staple of quantum Shannon theory, and has 
been used in quantum state merging [12], state transfer (also 
known as "Fully Quantum Slepian-Wolf") [1], for sending 
quantum information through quantum channels [10], for 
quantum broadcast channels [9], quantum channels with side- 
information [6], among other examples. 

The common point in all of the previous papers is that they 
use this argument to send quantum information. For sending 
classical information, on the other hand, the argument does 
not work directly. The reason for this is that if one sends 
classical information, the channel environment (the system 
E above) can also receive a copy of the message without 
impairing the protocol. However, it turns out that for the 
protocol to work, E can only share classical correlations 
with the message; in particular, E cannot contain any phase 
information about the message, otherwise Bob cannot decode. 
Hence, while the vast majority of quantum Shannon theory
can now be done using decoupling methods, classical coding
over quantum channels, the so-called Holevo-Schumacher-Westmoreland
theorem (HSW theorem) [11], [18], remains a
notable outlier. The purpose of this paper is to close this gap
and provide a decoupling proof, based on the above argument, 
of the HSW theorem. 

The results presented here have a somewhat similar flavor 
to those presented in [16], but a rather different emphasis. 
In both papers, the idea that the environment cannot have 
information about the phase of the classical message arises 
as a central theme. In [16], this occurs in the context of using 
complementary bases to get coding theorems from privacy 
amplification and information reconciliation, whereas here it 
arises as a natural analog of the concept of decoupling. 

The paper will be structured as follows. Section II will 
explain the notation and basic concepts needed for this paper, 
Section III will give a dequantizing theorem, which will be 
the analog of the decoupling theorem that we will need for 
the classical case, and Section IV will show how to use it to 
derive coding theorems for sending classical information over 
quantum channels. Finally, we discuss the results in Section 
V. 

II. Preliminaries and Notation 
A. Quantum States and Maps 

Let $\mathcal{H}$ be a finite dimensional, complex Hilbert space. The
set of linear operators on $\mathcal{H}$ will be denoted by $\mathcal{L}(\mathcal{H})$, the
set of Hermitian operators by $\mathcal{L}^\dagger(\mathcal{H})$ and the set of
positive-semidefinite operators by $\mathcal{P}(\mathcal{H})$. The set of quantum
states is given by $\mathcal{S}_=(\mathcal{H}) := \{\rho \in \mathcal{P}(\mathcal{H}) \mid \mathrm{tr}\,\rho = 1\}$ and the set
of subnormalized quantum states is $\mathcal{S}_\leq(\mathcal{H}) := \{\rho \in \mathcal{P}(\mathcal{H}) \mid
\mathrm{tr}\,\rho \leq 1\}$. A subscript letter following some mathematical object
denotes the physical system to which it belongs. However,
when it is clear which systems are described we might drop the
subscripts to shorten the notation. Given two physical systems
A and B, the joint bipartite system AB is represented by a
tensor product space $\mathcal{H}_A \otimes \mathcal{H}_B =: \mathcal{H}_{AB}$.

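We will denote by $\mathbb{1}_A$ the identity operator on $\mathcal{H}_A$
and by $\pi_A := \mathbb{1}_A/d_A$ the completely mixed state on A,
where $d_A = \dim \mathcal{H}_A$. For $d_A \geq d_B$ the states
$\Gamma_{AB} := \frac{1}{d_B} \sum_i |i\rangle\langle i|_A \otimes |i\rangle\langle i|_B$ and
$\Phi_{AB} := \frac{1}{d_B} \sum_{i,j} |i\rangle\langle j|_A \otimes |i\rangle\langle j|_B$
in $\mathcal{S}_=(\mathcal{H}_{AB})$ represent maximal classical and, respectively,
quantum correlations between the systems A and B.

Suppose $|\psi\rangle_{AB}$ is a pure state of the bipartite system
AB (i.e. the system is in the state $\psi_{AB} = |\psi\rangle\langle\psi|_{AB}$) and
$d_A \geq d_B$. Then there exist lists of orthonormal vectors
$\{|i\rangle_A\}_{i=1,\dots,d_B} \subset \mathcal{H}_A$ and $\{|i\rangle_B\}_{i=1,\dots,d_B} \subset \mathcal{H}_B$ such that
$|\psi\rangle_{AB} = \sum_i \lambda_i |i\rangle_A |i\rangle_B$, where $\lambda_i \geq 0$ and $\sum_i \lambda_i^2 = 1$ [15].
The corresponding basis $\{|i\rangle\}_{i=1,\dots,d_B}$ is called Schmidt basis
and the numbers $\lambda_i$ are Schmidt coefficients.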

A quantum state $\rho_{AB} \in \mathcal{S}_\leq(\mathcal{H}_{AB})$ is said to be classical
with respect to a fixed basis $\{|i\rangle\}_{i=1,\dots,d_A}$ of $\mathcal{H}_A$ if $\rho_{AB} \in
\mathrm{span}_{\mathbb{R}}\{|i\rangle\langle i|\}_{i=1,\dots,d_A} \otimes \mathcal{L}^\dagger(\mathcal{H}_B)$. If in addition $\rho_{AB}$ is not
classical on $\mathcal{H}_B$ we call it a hybrid classical-quantum or shortly
CQ-state. Moreover, we call a state $\rho_{XX'B} \in \mathcal{S}_\leq(\mathcal{H}_{XX'B})$
coherent classical on X and X' if it commutes with the
projector $P_{XX'} = \sum_x |x\rangle\langle x|_X \otimes |x\rangle\langle x|_{X'}$.

Fig. 1: Diagram illustrating the purification of a CQ-channel. 
The environment (depicted with a dashed box) of a CQ- 
channel can be split into two parts, a register X that contains 
a copy of the input and a system E' . 



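Linear maps from $\mathcal{L}(\mathcal{H}_A)$ to $\mathcal{L}(\mathcal{H}_B)$ will be denoted by
calligraphic letters, e.g. $\mathcal{T}_{A \to B} \in \mathrm{Hom}(\mathcal{L}(\mathcal{H}_A), \mathcal{L}(\mathcal{H}_B))$.
Quantum operations are in one-to-one correspondence with trace
preserving completely positive maps (TPCPMs). The TPCPM
we will encounter most often is the partial trace (over the
system B), denoted $\mathrm{tr}_B(\cdot)$, which is defined to be the adjoint
mapping of $\mathcal{T}_{A \to AB}(\xi_A) = \xi_A \otimes \mathbb{1}_B$ for $\xi_A \in \mathcal{L}^\dagger(\mathcal{H}_A)$ with
respect to the Hilbert-Schmidt scalar product $\langle A, B\rangle := \mathrm{tr}(A^\dagger B)$.
This means $\mathrm{tr}((\xi_A \otimes \mathbb{1}_B)\zeta_{AB}) = \mathrm{tr}(\xi_A\, \mathrm{tr}_B(\zeta_{AB}))$ for any
$\zeta_{AB} \in \mathcal{L}^\dagger(\mathcal{H}_{AB})$. Given a bipartite state $\xi_{AB}$, we write
$\xi_A := \mathrm{tr}_B\, \xi_{AB}$ for the reduced density operator on A and
$\xi_B := \mathrm{tr}_A\, \xi_{AB}$, respectively, on B. If $\xi_{AB}$ is pure, we call
$|\xi\rangle_{AB}$ a purification for $\xi_A$ and $\xi_B$.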

The map $\mathcal{C}_A(\cdot) = \sum_i |i\rangle\langle i|_A (\cdot) |i\rangle\langle i|_A$ classicalizes an
arbitrary density operator on A by removing all off-diagonal
elements. When $\mathcal{C}$ is applied to part of a bipartite state $\rho_{AB}$, we
get the CQ state $\rho^{\mathrm{cl}}_{AB} := (\mathcal{C}_A \otimes \mathcal{I}_B)(\rho_{AB})$. Here, $\mathcal{I}_B$ denotes
the identity map on B, which we will only write explicitly
if it is not clear from the context.
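
As a small numerical illustration of the classicalizing map (a sketch only; the helper name `classicalize_first` and the choice of a two-qubit example are ours, not part of the text), the following NumPy snippet applies $\mathcal{C}_A \otimes \mathcal{I}_B$ to the maximally entangled state $\Phi_{AB}$ and recovers the maximally classically correlated state $\Gamma_{AB}$.

```python
import numpy as np

def classicalize_first(rho_ab, d_a, d_b):
    """Apply C_A (x) I_B: keep only the blocks that are diagonal in the A index."""
    t = rho_ab.reshape(d_a, d_b, d_a, d_b)      # indices (a, b, a', b')
    mask = np.eye(d_a)[:, None, :, None]        # 1 where a == a', else 0
    return (t * mask).reshape(d_a * d_b, d_a * d_b)

# Maximally entangled two-qubit state Phi_AB (hypothetical example input).
phi = np.zeros(4)
phi[0] = phi[3] = 1 / np.sqrt(2)
Phi = np.outer(phi, phi)

# Dephasing A removes the coherences |00><11| and |11><00|.
Gamma = classicalize_first(Phi, 2, 2)
```

The trace is unchanged by the map, which only deletes off-diagonal blocks.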

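The Choi-Jamiolkowski representation [3], [13] of
$\mathcal{T}_{A \to B} \in \mathrm{Hom}(\mathcal{L}(\mathcal{H}_A), \mathcal{L}(\mathcal{H}_B))$ is given by the operator
$\omega_{A'B} := (\mathcal{T}_{A \to B} \otimes \mathcal{I}_{A'})(\Phi_{AA'})$, where $\mathcal{H}_{A'}$ is a copy of
$\mathcal{H}_A$. We say that $\mathcal{T}_{A \to B} \in \mathrm{Hom}(\mathcal{L}(\mathcal{H}_A), \mathcal{L}(\mathcal{H}_B))$ has
classical-quantum (CQ) structure if its Choi-Jamiolkowski
representation is a CQ-state. For a map $\mathcal{T}$ with Choi-Jamiolkowski
representation $\omega_{A'B}$ we define the map $\mathcal{T}^{\mathrm{cl}}$ to be the
unique map whose Choi-Jamiolkowski representation is
$\omega^{\mathrm{cl}}_{A'B} = \mathcal{C}_{A'}(\omega_{A'B})$.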

In our context it will also be important to purify quantum
channels. Given a TPCPM $\mathcal{T}_{A \to B} \in \mathrm{Hom}(\mathcal{L}(\mathcal{H}_A), \mathcal{L}(\mathcal{H}_B))$,
we define the unitary $(U_{\mathcal{T}})_{A \to BE}$ to be any particular Stinespring
dilation of $\mathcal{T}$. The purifying system E will be called the
environment of the channel. For a channel $\mathcal{T}_{A \to B}$, we define
the complementary channel $\bar{\mathcal{T}}_{A \to E} : X \mapsto \mathrm{tr}_B[U_{\mathcal{T}} X U_{\mathcal{T}}^\dagger]$ to
be the channel to the environment.

The purification of a CQ-channel $\mathcal{T}$ with $\mathcal{T}(\xi_A) =
\sum_i \mathrm{tr}(|i\rangle\langle i|_A\, \xi_A)\, \rho^{[i]}_B$ is given by $U_{\mathcal{T}}|i\rangle_A = |i\rangle_X \otimes |\rho^{[i]}\rangle_{BE'}$.
Thus, the environment of such a channel can conceptually
be split into two parts: a register X, which contains a copy
of the input to the channel, and a system E' which stems
from the purification of the operators $\rho^{[i]}_B$. See Figure 1 for
an illustration of this. The Choi-Jamiolkowski representation
of a complementary channel of a CQ-channel can be written as
$\omega_{A'E'X}$, where the systems X and A' are classically coherent.
Furthermore, we will frequently be considering channels that
are complementary to CQ channels; we will call such channels
"complementary CQ channels".

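The swap operator $F_{AA'}$ acting on the bipartite space $\mathcal{H}_{AA'}$
is given by $F_{AA'} := \sum_{i,j} |i\rangle\langle j|_A \otimes |j\rangle\langle i|_{A'}$. It is easy to verify
that for any $M_A, N_A \in \mathcal{L}(\mathcal{H}_A)$ the swap operator satisfies
$\mathrm{tr}(M_A N_A) = \mathrm{tr}((M_A \otimes N_{A'}) F_{AA'})$.

For any operator $\xi_A \in \mathcal{L}(\mathcal{H}_A)$ we denote by $\|\xi_A\|_1$ and
$\|\xi_A\|_2$ the Schatten 1- and 2-norms of $\xi_A$, respectively. These
norms are unitarily invariant and satisfy $\|\xi_A\|_2 \leq \|\xi_A\|_1 \leq
\sqrt{d_A}\, \|\xi_A\|_2$. The metric induced on $\mathcal{L}(\mathcal{H})$ via the Schatten
1-norm is $D(\rho, \sigma) := \|\rho - \sigma\|_1$. Another measure of closeness
between states in $\mathcal{P}(\mathcal{H})$ is the fidelity, $F(\rho, \sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1$.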
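
The swap identity and the norm ordering above are easy to check numerically. The following is a minimal sketch with randomly drawn matrices (the dimension, the random seed and all variable names are illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
N = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# Swap operator F = sum_{i,j} |i><j| (x) |j><i|, i.e. <i,j| F |j,i> = 1.
F = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        F[i * d + j, j * d + i] = 1.0

swap_lhs = np.trace(M @ N)
swap_rhs = np.trace(np.kron(M, N) @ F)

# Schatten norms from the singular values of M.
s = np.linalg.svd(M, compute_uv=False)
norm_1 = s.sum()
norm_2 = np.sqrt((s ** 2).sum())
```

Here `swap_lhs` and `swap_rhs` agree up to floating-point error, and `norm_2 <= norm_1 <= sqrt(d) * norm_2`.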

B. Permutation operators 

The symmetric group $S_d$ is the set of all bijective maps of
$\{1, \dots, d\}$ to itself together with the concatenation of maps
as the group multiplication. Elements $\pi$ of $S_d$ are called
permutations. Let $\mathcal{H}$ be a Hilbert space together with a fixed
basis $\{|i\rangle\}_{i=1,\dots,d}$. For $\pi \in S_d$, we define the permutation
operator $P(\pi)$ on $\mathcal{H}$ such that $P(\pi)|i\rangle = |\pi(i)\rangle$. The group
of all such matrices will be denoted by $\mathbb{P}$. Typically in this
paper $\{|i\rangle\}_{i=1,\dots,d}$ will be the Schmidt basis of a given density
matrix. The above permutation matrices then act by reordering
the elements of this basis.

Given a random variable $X : \mathbb{P} \to \Omega$ ($\Omega$ some measurable
space), we shall write $\mathbb{E}_{\mathbb{P}}[X] := \frac{1}{|\mathbb{P}|} \sum_{P \in \mathbb{P}} X(P)$ for the
expectation value of X with respect to the uniform probability
distribution on $\mathbb{P}$.

C. Smooth entropies 

Entropies are used to quantify the uncertainty an observer 
has about a quantum state. Moreover, conditional entropies 
quantify the uncertainty of an observer about one subsystem of 
a bipartite state when he has access to another subsystem. The 
most commonly used quantity is the von Neumann entropy. 
Given a state $\rho_{AB} \in \mathcal{S}_=(\mathcal{H}_{AB})$, we denote by $H(A|B)_\rho :=
H(\rho_{AB}) - H(\rho_B)$ the von Neumann entropy of A conditioned
on B, where $H(\rho) := -\mathrm{tr}(\rho \log \rho)$.

While the von Neumann entropy is appropriate for analyzing
processes involving a large number of copies of an identical
system, the min-entropy is relevant when a single system is
considered [17].

Min-Entropy [17] Let $\rho_{AB} \in \mathcal{S}_\leq(\mathcal{H}_{AB})$, then the min-entropy
of A conditioned on B of $\rho_{AB}$ is defined as

$$H_{\min}(A|B)_\rho := \max_{\sigma_B \in \mathcal{S}_=(\mathcal{H}_B)} \sup\{\lambda \in \mathbb{R} \mid \rho_{AB} \leq 2^{-\lambda}\, \mathbb{1}_A \otimes \sigma_B\}.$$
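
To get a feeling for the definition, consider the special case of a trivial system B, where the optimization over $\sigma_B$ disappears and the min-entropy reduces to $-\log_2$ of the largest eigenvalue of $\rho_A$. The sketch below (function names are illustrative, and the conditional case with a non-trivial B would require a semidefinite program that is not shown) compares it with the von Neumann entropy for a simple diagonal state.

```python
import numpy as np

def min_entropy_unconditional(rho):
    """H_min(A) for trivial B: the largest lambda with rho <= 2^(-lambda) * 1."""
    return -np.log2(np.linalg.eigvalsh(rho).max())

def von_neumann_entropy(rho):
    """H(rho) = -tr(rho log2 rho), ignoring zero eigenvalues."""
    vals = np.linalg.eigvalsh(rho)
    vals = vals[vals > 1e-15]
    return float(-(vals * np.log2(vals)).sum())

rho = np.diag([0.5, 0.25, 0.25])
h_min = min_entropy_unconditional(rho)
h_vn = von_neumann_entropy(rho)
```

For this state the min-entropy is 1 bit while the von Neumann entropy is 1.5 bits, in line with the min-entropy never exceeding the von Neumann entropy.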

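More generally, the smooth conditional min-entropy is defined
as the largest conditional min-entropy one can get within a
distance of at most $\varepsilon$ from $\rho$. Here closeness is measured
with respect to the purified distance, $P(\rho, \sigma)$, which is defined
as [22]

$$P(\rho, \sigma) := \sqrt{1 - \bar{F}(\rho, \sigma)^2},$$

where $\bar{F}(\rho, \sigma)$ is the generalized fidelity; $\bar{F}(\rho, \sigma) := F(\rho, \sigma) +
\sqrt{(1 - \mathrm{tr}\,\rho)(1 - \mathrm{tr}\,\sigma)}$ for $\rho, \sigma \in \mathcal{S}_\leq(\mathcal{H})$. The purified distance
constitutes a metric [22] on $\mathcal{S}_\leq(\mathcal{H})$ and satisfies the Fuchs-van
de Graaf inequalities

$$\tfrac{1}{2}\|\rho - \sigma\|_1 + \tfrac{1}{2}|\mathrm{tr}\,\rho - \mathrm{tr}\,\sigma| \;\leq\; P(\rho, \sigma) \;\leq\; \sqrt{\|\rho - \sigma\|_1 + |\mathrm{tr}\,\rho - \mathrm{tr}\,\sigma|}. \qquad (1)$$

We say that $\hat\rho$ is $\varepsilon$-close to $\rho$, denoted $\hat\rho \approx_\varepsilon \rho$, if $P(\rho, \hat\rho) \leq \varepsilon$.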
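
The following sketch evaluates the purified distance for one subnormalized and one normalized qubit state and compares it with both sides of Inequality (1). The two states and all function names are illustrative choices, not taken from the text.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_1 (sum of singular values)."""
    return np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False).sum()

def purified_distance(rho, sigma):
    gap = max(0.0, (1 - np.trace(rho).real) * (1 - np.trace(sigma).real))
    f_gen = fidelity(rho, sigma) + np.sqrt(gap)      # generalized fidelity
    return np.sqrt(max(0.0, 1 - f_gen ** 2))

rho = np.diag([0.6, 0.3])                      # subnormalized, trace 0.9
sigma = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])   # pure state |+><+|

dist = purified_distance(rho, sigma)
one_norm = np.abs(np.linalg.eigvalsh(rho - sigma)).sum()
tr_gap = abs(np.trace(rho) - np.trace(sigma))
lower = 0.5 * one_norm + 0.5 * tr_gap
upper = np.sqrt(one_norm + tr_gap)
```

For these inputs `lower <= dist <= upper`, as Inequality (1) requires.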

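Smooth Min-Entropy [17], [22] Let $\varepsilon \geq 0$ and let $\rho_{AB} \in
\mathcal{S}_\leq(\mathcal{H}_{AB})$ with $\sqrt{\mathrm{tr}\,\rho} > \varepsilon$, then the $\varepsilon$-smooth min-entropy of
A conditioned on B of $\rho_{AB}$ is defined as

$$H^\varepsilon_{\min}(A|B)_\rho = \max_{\hat\rho} H_{\min}(A|B)_{\hat\rho},$$

where we maximize over all $\hat\rho \approx_\varepsilon \rho$.
Next, we define the smooth max-entropy.

Smooth Max-Entropy [17], [22] Let $\varepsilon \geq 0$, let $\rho_{AB} \in
\mathcal{S}_\leq(\mathcal{H}_{AB})$ and let $\rho_{ABC} \in \mathcal{S}_\leq(\mathcal{H}_{ABC})$ be an arbitrary purification
of $\rho_{AB}$. The $\varepsilon$-smooth max-entropy of A conditioned
on B of $\rho_{AB}$ is defined as

$$H^\varepsilon_{\max}(A|B)_\rho = -H^\varepsilon_{\min}(A|C)_\rho.$$

The fully quantum asymptotic equipartition property (QAEP)
states that in the limit of an infinite number of identical
states the smooth min- and max-entropies converge to the von
Neumann entropy [21], [20]: Let $\rho_{AB} \in \mathcal{S}_=(\mathcal{H}_{AB})$, then

$$H(A|B)_\rho = \lim_{n \to \infty} \frac{1}{n} H^\varepsilon_{\min}(A^n|B^n)_{\rho^{\otimes n}}
= \lim_{n \to \infty} \frac{1}{n} H^\varepsilon_{\max}(A^n|B^n)_{\rho^{\otimes n}}. \qquad (2)$$

In that sense, the smooth conditional min- and max-entropies
can be seen as one-shot generalizations of the von Neumann
entropy.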

D. Uhlmann's theorem and existence of a decoding operation 

To prove a coding theorem it is necessary to establish the 
existence of a decoding operation. That is, given a quantum 
state that results from the execution of some quantum channel, 
we would like to recover the message originally encoded 
into the input of the channel. It turns out that this can be 
achieved if and only if the environment of the channel and 
some reference system purifying the original message are left 
uncorrelated after the execution of the channel. In this situation
the existence of a decoding operation follows from Uhlmann's
Theorem [23], which we shall state here for completeness.

Theorem 2.1 (Uhlmann's Theorem): Let $\rho_A, \sigma_A \in \mathcal{S}_=(\mathcal{H}_A)$
be two quantum states with respective purifications $|\phi\rangle_{AB}$ and
$|\psi\rangle_{AC}$. Then,

$$F(\rho_A, \sigma_A) = \max_{V_{B \to C}} |\langle\psi| V |\phi\rangle|,$$

where the maximization goes over all partial isometries from
B to C.
Since our decoupling results involve the Schatten 1-norm
rather than the fidelity it will be useful to transform the above
theorem into a statement formulated in terms of Schatten
1-norms. The following Corollary [5] follows from Uhlmann's
Theorem with an application of the Fuchs-van de Graaf
inequalities (cf. Equation (1)).

Corollary 2.2: Let $\rho_{AB}$ and $\sigma_{AC}$ be pure quantum
states and assume that $\|\rho_A - \sigma_A\|_1 \leq \varepsilon$. Then there exists
some isometry $U_{B \to C}$ such that $\|U \rho_{AB} U^\dagger - \sigma_{AC}\|_1 \leq 2\sqrt{\varepsilon}$.

Fig. 2: Illustration of the dequantizing theorem. The top
diagram illustrates the situation in which we apply the
dequantizing theorem: we apply a random permutation P to $\rho_{AR}$
followed by the channel. The bottom diagram illustrates the
"ideal" state we would like to get at the end: a state containing
only classical correlations between R and E.

III. Dequantizing Theorem 

In this section, we will derive the dequantizing theorem 
which will be the core technical ingredient for our coding 
theorems. Our aim will be to derive conditions under which 
the output of a channel contains only classical correlations 
with a reference system. More precisely, we will prove the 
following: 

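Theorem 3.1 (Dequantizing Theorem): Let $\bar{\mathcal{T}}_{A \to EX}$ be a
complementary CQ channel, and let $\omega_{A'EX} \in \mathcal{S}_\leq(\mathcal{H}_{A'EX})$
be its Choi-Jamiolkowski representation. Let $\rho_{AR}$ be a pure
state on $\mathcal{H}_{AR}$, and let $\rho^{\mathrm{cl}}_{AR} := \mathcal{C}_A(\rho_{AR})$. Then

$$\mathbb{E}_{\mathbb{P}} \left\| \bar{\mathcal{T}}\big(P_A (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1
\leq \sqrt{\frac{1}{d_A - 1}}\; 2^{-\frac{1}{2} H_{\min}(A'|EX)_\omega - \frac{1}{2} H_{\min}(A|R)_\rho}, \qquad (3)$$

where the permutation operators act by permuting the
Schmidt-basis vectors of $\rho_{AR}$.

In other words, Theorem 3.1 gives a bound on how close the
state $\bar{\mathcal{T}}(P_A \rho_{AR} P_A^\dagger)$ is to a state containing only classical
correlations between R and E (namely, $\bar{\mathcal{T}}(P_A \rho^{\mathrm{cl}}_{AR} P_A^\dagger)$). See
Figure 2 for an illustration.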

The rest of this section is devoted to the proof of Theorem
3.1 and is organized in three subsections. In the first one
we calculate the above expectation value with the Schatten
1-norm replaced by the Schatten 2-norm. We conclude the
proof of the above theorem in the second subsection, showing
how the statement found about the Schatten 2-norm can be
transformed into the one above. In the third subsection we
reformulate the upper bound of Theorem 3.1 using the smooth
conditional min-entropy. This enables us to make statements
about independent, identically distributed channels via the
QAEP, Equation (2).



A. Dequantizing with Schatten 2-Norms 

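We first prove a statement that holds for general hermiticity
preserving, linear maps $\mathcal{N}_{A \to B} \in \mathrm{Hom}(\mathcal{L}(\mathcal{H}_A), \mathcal{L}(\mathcal{H}_B))$ with
Choi-Jamiolkowski representation $\omega_{A'B} \in \mathcal{L}^\dagger(\mathcal{H}_{A'B})$. In
Proposition 3.3 below, we will compute the expectation value

$$\mathbb{E}_{\mathbb{P}} \left\| \mathcal{N}\big(P_A (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_2^2
= \frac{d_A}{d_A - 1}\, \|\rho_{AR} - \rho^{\mathrm{cl}}_{AR}\|_2^2\; \|\omega_{A'B} - \omega^{\mathrm{cl}}_{A'B}\|_2^2,$$

where the permutation operators act by permuting the basis
vectors of the Schmidt basis¹. To prove this, we first need the
following lemma: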

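Lemma 3.2: Let $\mathcal{H}_A$ be a Hilbert space with orthonormal
basis $\{|i\rangle\}_{i=1,\dots,d_A}$ and let $\mathbb{P}$ be the corresponding set of
permutation operators. Then for any $i \neq j$ one has that

$$\mathbb{E}_{P \in \mathbb{P}} \Big( P^{\otimes 2} \big(|i\rangle\langle j|_A \otimes |j\rangle\langle i|_{A'}\big) (P^\dagger)^{\otimes 2} \Big)
= \frac{1}{d_A (d_A - 1)} \big(F_{AA'} - d_A \Gamma_{AA'}\big).$$

Proof: There are $d_A!$ permutation operators in $\mathbb{P}$. For
arbitrary but fixed $i \neq j$ and $k \neq l$ there are $(d_A - 2)!$
permutation operators with $(P_A)^{\otimes 2} |i\rangle_A \otimes |j\rangle_{A'} = |k\rangle_A \otimes |l\rangle_{A'}$.
On the other hand there is no permutation such that for $i \neq j$
the operator $(P_A)^{\otimes 2}$ maps $|i\rangle_A \otimes |j\rangle_{A'}$ to $|k\rangle_A \otimes |l\rangle_{A'}$ with
$k = l$. We conclude that

$$\mathbb{E}_{\mathbb{P}} \Big( (P_A)^{\otimes 2} \big(|i\rangle\langle j|_A \otimes |j\rangle\langle i|_{A'}\big) (P_A^\dagger)^{\otimes 2} \Big)
= \frac{(d_A - 2)!}{d_A!} \sum_{k \neq l} |k\rangle\langle l|_A \otimes |l\rangle\langle k|_{A'}
= \frac{1}{d_A (d_A - 1)} \big(F_{AA'} - d_A \Gamma_{AA'}\big),$$

where the last step uses $\sum_k |k\rangle\langle k|_A \otimes |k\rangle\langle k|_{A'} = d_A \Gamma_{AA'}$. ■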
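
Lemma 3.2 can be verified by brute force for a small dimension. The sketch below averages over all $d_A!$ permutation matrices for $d_A = 3$; the dimension, the chosen pair $(i, j)$ and the helper names are illustrative.

```python
import itertools
import math
import numpy as np

d = 3

def ketbra(i, j):
    m = np.zeros((d, d))
    m[i, j] = 1.0
    return m

def perm_matrix(perm):
    """Matrix of P(pi) with P|i> = |pi(i)>."""
    p = np.zeros((d, d))
    p[list(perm), np.arange(d)] = 1.0
    return p

# Swap operator and maximally classically correlated state on A (x) A'.
F = sum(np.kron(ketbra(i, j), ketbra(j, i)) for i in range(d) for j in range(d))
Gamma = sum(np.kron(ketbra(i, i), ketbra(i, i)) for i in range(d)) / d

i, j = 0, 2                                   # any pair with i != j
X = np.kron(ketbra(i, j), ketbra(j, i))

avg = np.zeros((d * d, d * d))
for perm in itertools.permutations(range(d)):
    P2 = np.kron(perm_matrix(perm), perm_matrix(perm))
    avg += P2 @ X @ P2.T
avg /= math.factorial(d)

expected = (F - d * Gamma) / (d * (d - 1))
```

The averaged operator `avg` coincides with `expected`, independently of which pair $i \neq j$ is chosen.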



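Proposition 3.3 (Distance from classicality): Let
$\mathcal{N}_{A \to B} \in \mathrm{Hom}(\mathcal{L}(\mathcal{H}_A), \mathcal{L}(\mathcal{H}_B))$ be a linear map with
Choi-Jamiolkowski representation $\omega_{A'B} \in \mathcal{L}^\dagger(\mathcal{H}_{A'B})$, and let
$|\rho\rangle_{AR} = \sum_i \sqrt{\lambda_i}\, |ii\rangle_{AR}$. Then

$$\mathbb{E}_{\mathbb{P}} \left\| \mathcal{N}\big(P_A (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_2^2
= \frac{d_A}{d_A - 1}\, \|\rho_{AR} - \rho^{\mathrm{cl}}_{AR}\|_2^2\; \|\omega_{A'B} - \omega^{\mathrm{cl}}_{A'B}\|_2^2,$$

where the operators P permute the Schmidt-basis vectors of
$\rho_A$.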

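Proof: Rewriting the Schatten 2-norm in terms of the
trace, we get

$$\begin{aligned}
\mathbb{E}_{\mathbb{P}} & \left\| \mathcal{N}\big((P_A \otimes \mathbb{1}_R)(\rho_{AR} - \rho^{\mathrm{cl}}_{AR})(P_A^\dagger \otimes \mathbb{1}_R)\big) \right\|_2^2 \\
&= \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\Big( \mathcal{N}\big((P_A \otimes \mathbb{1}_R)(\rho_{AR} - \rho^{\mathrm{cl}}_{AR})(P_A^\dagger \otimes \mathbb{1}_R)\big)^2 \Big) \\
&= \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\Big( \Big( \sum_{i \neq j} \sqrt{\lambda_i \lambda_j}\; \mathcal{N}(P_A |i\rangle\langle j|_A P_A^\dagger) \otimes |i\rangle\langle j|_R \Big)^2 \Big) \\
&= \sum_{i \neq j} \lambda_i \lambda_j\; \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\Big( \mathcal{N}(P_A |i\rangle\langle j|_A P_A^\dagger)\; \mathcal{N}(P_A |j\rangle\langle i|_A P_A^\dagger) \Big) \\
&= \sum_{i \neq j} \lambda_i \lambda_j\; \mathrm{tr}\Big( \mathcal{N}^{\otimes 2}\Big( \mathbb{E}_{\mathbb{P}} \big( P^{\otimes 2} (|i\rangle\langle j|_A \otimes |j\rangle\langle i|_{A'}) (P^\dagger)^{\otimes 2} \big) \Big) F_{BB'} \Big) \qquad (4) \\
&= \sum_{i \neq j} \frac{\lambda_i \lambda_j}{d_A (d_A - 1)}\; \mathrm{tr}\Big( \mathcal{N}^{\otimes 2}\big(F_{AA'} - d_A \Gamma_{AA'}\big) F_{BB'} \Big) \qquad (5) \\
&= \frac{\|\rho_{AR} - \rho^{\mathrm{cl}}_{AR}\|_2^2}{d_A (d_A - 1)} \Big( \mathrm{tr}\big( \mathcal{N}^{\otimes 2}(F_{AA'}) F_{BB'} \big) - \mathrm{tr}\big( (\mathcal{N}^{\mathrm{cl}})^{\otimes 2}(F_{AA'}) F_{BB'} \big) \Big). \qquad (6)
\end{aligned}$$

Equation (4) is by an application of the swap trick and
in equation (5) we applied Lemma 3.2. To simplify (6) we
evaluate the term $\mathrm{tr}(\mathcal{N}^{\otimes 2}(F_{AA'}) F_{BB'})$ using the inverse
Choi-Jamiolkowski isomorphism:

$$\begin{aligned}
\mathrm{tr}\big( \mathcal{N}^{\otimes 2}(F_{AA'}) F_{BB'} \big)
&= d_A^2\; \mathrm{tr}\Big( \mathrm{tr}_{AA'}\big( \omega^{\otimes 2} (F_{AA'} \otimes \mathbb{1}_{BB'}) \big) F_{BB'} \Big) \\
&= d_A^2\; \mathrm{tr}\Big( \omega^{\otimes 2} (F_{AA'} \otimes \mathbb{1}_{BB'}) (\mathbb{1}_{AA'} \otimes F_{BB'}) \Big) \qquad (7) \\
&= d_A^2\; \mathrm{tr}\big( \omega_{A'B}^2 \big).
\end{aligned}$$

In Equation (7) we used the fact that the adjoint mapping of
the partial trace is tensoring with the identity. Analogously,
one has

$$\mathrm{tr}\big( (\mathcal{N}^{\mathrm{cl}})^{\otimes 2}(F_{AA'}) F_{BB'} \big) = d_A^2\; \mathrm{tr}\big( (\omega^{\mathrm{cl}}_{A'B})^2 \big)$$

and the second factor of (6) becomes

$$\mathrm{tr}\big( \mathcal{N}^{\otimes 2}(F_{AA'}) F_{BB'} \big) - \mathrm{tr}\big( (\mathcal{N}^{\mathrm{cl}})^{\otimes 2}(F_{AA'}) F_{BB'} \big)
= d_A^2 \Big( \mathrm{tr}\big(\omega_{A'B}^2\big) - \mathrm{tr}\big((\omega^{\mathrm{cl}}_{A'B})^2\big) \Big)
= d_A^2\, \|\omega_{A'B} - \omega^{\mathrm{cl}}_{A'B}\|_2^2. \qquad ■$$

¹ An extension of this result to an arbitrary basis is known [19].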
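
The identity of Proposition 3.3 can also be checked numerically. The sketch below is an illustration under our own assumptions: it draws a random TPCPM through Kraus operators, takes a pure state whose Schmidt basis is the computational basis, and averages the squared 2-norm over all permutations. Dimensions, seed and helper names are hypothetical.

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(7)
d_a, d_r, d_b, n_kraus = 3, 3, 2, 2

# Random Kraus operators K_k : A -> B with sum_k K_k^dagger K_k = 1.
g = rng.normal(size=(n_kraus * d_b, d_a)) + 1j * rng.normal(size=(n_kraus * d_b, d_a))
q, _ = np.linalg.qr(g)                        # q has orthonormal columns
kraus = q.reshape(n_kraus, d_b, d_a)

def channel(x, d_other=1):
    """Apply N to the first factor of an operator on A (x) other."""
    ks = [np.kron(k, np.eye(d_other)) for k in kraus]
    return sum(k @ x @ k.conj().T for k in ks)

def ketbra(i, j):
    m = np.zeros((d_a, d_a))
    m[i, j] = 1.0
    return m

def dephase_first(x, d1, d2):
    """Remove all blocks that are off-diagonal in the first tensor factor."""
    t = x.reshape(d1, d2, d1, d2)
    return (t * np.eye(d1)[:, None, :, None]).reshape(d1 * d2, d1 * d2)

def perm_matrix(perm):
    p = np.zeros((d_a, d_a))
    p[list(perm), np.arange(d_a)] = 1.0
    return p

# Pure state |rho>_AR = sum_i sqrt(lam_i) |ii> and its non-classical part.
lam = rng.dirichlet(np.ones(d_a))
vec = np.zeros(d_a * d_r)
for i in range(d_a):
    vec[i * d_r + i] = np.sqrt(lam[i])
rho = np.outer(vec, vec)
delta_rho = rho - dephase_first(rho, d_a, d_r)

# Choi-Jamiolkowski operator, ordered as A' (x) B, and its non-classical part.
omega = sum(np.kron(ketbra(i, j), channel(ketbra(i, j)))
            for i in range(d_a) for j in range(d_a)) / d_a
delta_omega = omega - dephase_first(omega, d_a, d_b)

lhs = 0.0
for perm in itertools.permutations(range(d_a)):
    p = np.kron(perm_matrix(perm), np.eye(d_r))
    lhs += np.linalg.norm(channel(p @ delta_rho @ p.T, d_r)) ** 2   # Frobenius = Schatten 2-norm
lhs /= math.factorial(d_a)

rhs = d_a / (d_a - 1) * np.linalg.norm(delta_rho) ** 2 * np.linalg.norm(delta_omega) ** 2
```

Up to floating-point error `lhs` equals `rhs`; replacing the Kraus operators by any other linear map should leave the agreement intact, since the proposition does not require complete positivity.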



B. Dequantizing with the Schatten 1-norm 

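In this subsection we derive Theorem 3.1 with an application
of Proposition 3.3. We use the following lemma.

Lemma 3.4: For any $\xi_{AR} \in \mathcal{S}_\leq(\mathcal{H}_{AR})$, there exists an
operator $\zeta_R \in \mathcal{S}_=(\mathcal{H}_R)$ with

$$\frac{1}{\mathrm{tr}(\xi_{AR})}\, \mathrm{tr}\Big( \big( (\mathbb{1}_A \otimes \zeta_R^{-1/2})\, \xi_{AR} \big)^2 \Big) \leq 2^{-H_{\min}(A|R)_\xi}.$$

Proof: Choose $\zeta_R$ such that it maximizes the min-entropy,
i.e. it satisfies $\xi_{AR} \leq 2^{-H_{\min}(A|R)_\xi}\, \mathbb{1}_A \otimes \zeta_R$. Hence,

$$\sqrt{\xi_{AR}}\, (\mathbb{1}_A \otimes \zeta_R^{-1/2})\, \xi_{AR}\, (\mathbb{1}_A \otimes \zeta_R^{-1/2})\, \sqrt{\xi_{AR}} \leq 2^{-H_{\min}(A|R)_\xi}\, \xi_{AR}.$$

Taking the trace on both sides concludes the proof. ■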
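Proof of Theorem 3.1: We first introduce some notation.
We abbreviate the difference between $\rho_{AR}$ and its classicalized
version by writing $\bar\rho_{AR} := \rho_{AR} - \rho^{\mathrm{cl}}_{AR}$. By Lemma 3.4 there
are operators $\sigma_{EX}$ and $\tau_R$ such that

$$\frac{1}{\mathrm{tr}[\omega_{A'EX}]}\, \mathrm{tr}\Big( \big( (\mathbb{1}_{A'} \otimes \sigma_{EX}^{-1/2})\, \omega_{A'EX} \big)^2 \Big) \leq 2^{-H_{\min}(A'|EX)_\omega}$$

and

$$\frac{1}{\mathrm{tr}[\rho_{AR}]}\, \mathrm{tr}\Big( \big( (\mathbb{1}_A \otimes \tau_R^{-1/2})\, \rho_{AR} \big)^2 \Big) \leq 2^{-H_{\min}(A|R)_\rho}.$$

Since $\bar{\mathcal{T}}_{A' \to EX}$ is a complementary channel of a CQ-channel
we can assume that $\sigma_{EX}$ has CQ-structure, i.e. $\sigma_{EX} =
\sum_x \sigma^x_E \otimes |x\rangle\langle x|_X$. Furthermore the operator $\rho_{AR}$ is given
in its Schmidt-basis, such that $\tau_R$ can be written as $\tau_R =
\sum_x \tau_x |x\rangle\langle x|_R$. (Both facts follow from Lemma A.1 in
Appendix A with $\varepsilon = 0$.)

We introduce a system P with $(d_A!)$-dimensional Hilbert
space, $\mathcal{H}_P$, and canonical basis vectors $|P\rangle_P$ that correspond
to the permutation operators $P \in \mathbb{P}$. We define the operator

$$\zeta_{PREX} := \sum_{P \in \mathbb{P}} \sum_{x=1}^{d_A} \frac{1}{d_A!}\, |P\rangle\langle P|_P \otimes \tau_x |x\rangle\langle x|_R \otimes \sigma^{P(x)}_E \otimes |P(x)\rangle\langle P(x)|_X,$$

which can be inverted on its support to yield

$$\zeta^{-1/4}_{PREX} := (d_A!)^{1/4} \sum_{P \in \mathbb{P}} \sum_{x=1}^{d_A} |P\rangle\langle P|_P \otimes \tau_x^{-1/4} |x\rangle\langle x|_R \otimes \big(\sigma^{P(x)}_E\big)^{-1/4} \otimes |P(x)\rangle\langle P(x)|_X.$$

We note that the operator $\zeta^{1/4} \zeta^{-1/4}$ is a projector and

$$\sum_P |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big)
= \big(\zeta^{1/4} \zeta^{-1/4}\big) \Big[ \sum_P |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \Big] \big(\zeta^{1/4} \zeta^{-1/4}\big). \qquad (8)$$

Using these operators, we write

$$\begin{aligned}
\mathbb{E}_{\mathbb{P}} \left\| \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \right\|_1
&= \frac{1}{d_A!} \sum_P \left\| |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \right\|_1 \\
&= \Big\| \frac{1}{d_A!} \sum_P |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \Big\|_1 \\
&\leq \sqrt{\mathrm{tr}(\zeta_{PREX})}\; \Big\| \zeta^{-1/4} \Big( \frac{1}{d_A!} \sum_P |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \Big) \zeta^{-1/4} \Big\|_2. \qquad (9)
\end{aligned}$$

Inequality (9) follows from Equation (8) together with
an application of the Hölder-type inequality $\|ABC\|_1 \leq
\| |A|^4 \|_1^{1/4}\, \| |B|^2 \|_1^{1/2}\, \| |C|^4 \|_1^{1/4}$ [2]. The trace term on the right
hand side of Inequality (9) can be evaluated directly to be

$$\mathrm{tr}(\zeta_{PREX}) = \sum_{x=1}^{d_A} \tau_x\; \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\big(\sigma^{P(x)}_E\big) = \frac{1}{d_A},$$

since $P(x)$ is uniformly distributed and $\sigma_{EX}$ and $\tau_R$ are normalized.
Thus, it is sufficient to evaluate the term with the Schatten
2-norm. For notational convenience we introduce the map
$\tilde{\mathcal{T}}(\cdot) := \sigma_{EX}^{-1/4}\, \bar{\mathcal{T}}(\cdot)\, \sigma_{EX}^{-1/4}$ with Choi-Jamiolkowski
representation $\tilde\omega_{A'EX}$ and the operator $\tilde\rho_{AR} := (\mathbb{1}_A \otimes
\tau_R^{-1/4})\, \rho_{AR}\, (\mathbb{1}_A \otimes \tau_R^{-1/4})$. Using the fact that $\bar{\mathcal{T}}$ is the
complementary channel of a CQ-channel one can verify that

$$\begin{aligned}
\zeta^{-1/4} \Big( \frac{1}{d_A!} \sum_{P \in \mathbb{P}} |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \Big) \zeta^{-1/4}
&= \frac{1}{\sqrt{d_A!}} \sum_{P \in \mathbb{P}} |P\rangle\langle P|_P \otimes (\sigma_{EX} \otimes \tau_R)^{-1/4}\, \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big)\, (\sigma_{EX} \otimes \tau_R)^{-1/4} \\
&= \frac{1}{\sqrt{d_A!}} \sum_{P \in \mathbb{P}} |P\rangle\langle P|_P \otimes \tilde{\mathcal{T}}\big(P_A (\tilde\rho_{AR} - \tilde\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big).
\end{aligned}$$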



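Using this, we find

$$\begin{aligned}
\Big\| \zeta^{-1/4} \Big( \frac{1}{d_A!} \sum_P |P\rangle\langle P|_P \otimes \bar{\mathcal{T}}\big(P_A \bar\rho_{AR} P_A^\dagger\big) \Big) \zeta^{-1/4} \Big\|_2^2
&= \mathbb{E}_{\mathbb{P}} \left\| \tilde{\mathcal{T}}\big(P_A (\tilde\rho_{AR} - \tilde\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_2^2 \\
&= \frac{d_A}{d_A - 1}\, \|\tilde\omega_{A'EX} - \tilde\omega^{\mathrm{cl}}_{A'EX}\|_2^2\; \|\tilde\rho_{AR} - \tilde\rho^{\mathrm{cl}}_{AR}\|_2^2 \\
&\leq \frac{d_A}{d_A - 1}\, \mathrm{tr}\big(\tilde\omega_{A'EX}^2\big)\, \mathrm{tr}\big(\tilde\rho_{AR}^2\big) \\
&\leq \frac{d_A}{d_A - 1}\, 2^{-H_{\min}(A'|EX)_\omega - H_{\min}(A|R)_\rho},
\end{aligned}$$

where we apply Proposition 3.3 to obtain the second equality
and use the special choice of $\sigma_{EX}$ and $\tau_R$ (cf. Lemma 3.4) for
the last inequality. Plugging this into Equation (9) concludes
the proof. ■

C. A smoothed version of the dequantizing theorem

The reason for introducing smooth versions of the min- and
max-entropy is that the unsmoothed quantities can vary a lot with small
variations in the underlying states while the quantities that
we are bounding with them do not. This is such a case; it
is therefore desirable to have a version of the dequantizing
theorem which involves the smooth entropies. The smooth
quantities have the additional advantage that they converge
to the corresponding von Neumann quantities in the i.i.d. case
(cf. Equation (2)). We therefore prove the following:

Theorem 3.5: Let $\bar{\mathcal{T}}_{A \to EX}$ be a complementary channel
of a CQ-channel, let $\omega_{A'EX} \in \mathcal{S}_\leq(\mathcal{H}_{EXA'})$ be the
Choi-Jamiolkowski representation of $\bar{\mathcal{T}}$, and let $|\rho\rangle_{AR} =
\sum_i \sqrt{\lambda_i}\, |ii\rangle_{AR}$. Let $\varepsilon, \varepsilon'$ be such that $\sqrt{\mathrm{tr}(\rho)} > \varepsilon \geq 0$ and
$\sqrt{\mathrm{tr}(\omega)} > \varepsilon' \geq 0$. Then,

$$\mathbb{E}_{\mathbb{P}} \left\| \bar{\mathcal{T}}\big(P_A (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1
\leq \sqrt{\frac{1}{d_A - 1}}\; 2^{-\frac{1}{2} H^{\varepsilon'}_{\min}(A'|EX)_\omega - \frac{1}{2} H^{\varepsilon}_{\min}(A|R)_\rho} + 8\varepsilon + 8\varepsilon',$$

where the permutation operators act by permuting the
Schmidt-basis vectors of $\rho_A$.

Proof: Let $\hat\omega_{A'EX} \in \mathcal{S}_\leq(\mathcal{H}_{A'EX})$ be a state that
saturates the bound in the definition of the smooth min-entropy,
i.e. $P(\hat\omega_{A'EX}, \omega_{A'EX}) \leq \varepsilon'$ and $H_{\min}(A'|EX)_{\hat\omega} =
H^{\varepsilon'}_{\min}(A'|EX)_\omega$. Analogously, $\hat\rho_{AR}$ satisfies $P(\hat\rho_{AR}, \rho_{AR}) \leq \varepsilon$
and $H_{\min}(A|R)_{\hat\rho} = H^{\varepsilon}_{\min}(A|R)_\rho$.
Using inequality (1), we find that

$$\|\hat\omega_{A'EX} - \omega_{A'EX}\|_1 \leq 2\varepsilon', \qquad \|\hat\rho_{AR} - \rho_{AR}\|_1 \leq 2\varepsilon. \qquad (10)$$

We decompose $\hat\omega - \omega$ into positive operators with orthogonal
support, writing $\hat\omega - \omega = \Delta_+ - \Delta_-$, and conclude from (10)
that $\|\Delta_+\|_1 \leq 2\varepsilon'$ and $\|\Delta_-\|_1 \leq 2\varepsilon'$.

Similarly, we decompose $(\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) - (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) =
\Gamma_+ - \Gamma_-$ with the operators $\Gamma_+$ and $\Gamma_-$ again chosen to be
positive and with orthogonal support. From the second inequality
in (10), we conclude that $\|\Gamma_+\|_1 \leq 4\varepsilon$ and $\|\Gamma_-\|_1 \leq 4\varepsilon$.

Let $\hat{\mathcal{T}}$, $\mathcal{D}_+$ and $\mathcal{D}_-$ be the unique Choi-Jamiolkowski
preimages of $\hat\omega_{A'EX}$, $\Delta_+$ and $\Delta_-$ respectively. We note that
from the fact that $\omega_{A'EX}$ is classically coherent between
A' and X it follows that $\hat\omega_{A'EX}$ also shares this property
(see Appendix A, Lemma A.1). Furthermore the state $\rho_{AR}$
is classically coherent between A and R, such that the state
$\hat\rho_{AR}$ has the same Schmidt-basis as $\rho_{AR}$ (Lemma A.1). We
therefore can apply Theorem 3.1 to the states $\hat\omega$ and $\hat\rho$ to find

$$\sqrt{\frac{1}{d_A - 1}}\; 2^{-\frac{1}{2} H^{\varepsilon'}_{\min}(A'|EX)_\omega - \frac{1}{2} H^{\varepsilon}_{\min}(A|R)_\rho}
\geq \mathbb{E}_{\mathbb{P}} \left\| \hat{\mathcal{T}}\big(P_A (\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1.$$

Applying the triangle inequality twice shows that for any
permutation operator we have

$$\begin{aligned}
\left\| \hat{\mathcal{T}}\big(P_A (\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1
\geq{} & \left\| \bar{\mathcal{T}}\big(P_A (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1 \\
& - \left\| (\bar{\mathcal{T}} - \hat{\mathcal{T}})\big(P_A (\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1 \\
& - \left\| \bar{\mathcal{T}}\big(P_A ((\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) - (\rho_{AR} - \rho^{\mathrm{cl}}_{AR})) P_A^\dagger\big) \right\|_1. \qquad (11)
\end{aligned}$$

The first term on the right-hand side of Inequality (11)
corresponds to the unsmoothed dequantizing theorem. For the
remaining two terms we find upper bounds:

$$\begin{aligned}
\mathbb{E}_{\mathbb{P}} \left\| (\bar{\mathcal{T}} - \hat{\mathcal{T}})\big(P_A (\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1
&\leq \sum_{\alpha \in \{+,-\}} \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\big( \mathcal{D}_\alpha(P_A \hat\rho_{AR} P_A^\dagger) \big)
 + \sum_{\alpha \in \{+,-\}} \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\big( \mathcal{D}_\alpha(P_A \hat\rho^{\mathrm{cl}}_{AR} P_A^\dagger) \big) \\
&= 2 \sum_{\alpha \in \{+,-\}} \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\big( \mathcal{D}_\alpha(P_A \hat\rho_A P_A^\dagger) \big) \\
&\leq 2 \big( \mathrm{tr}(\Delta_+) + \mathrm{tr}(\Delta_-) \big) \leq 8\varepsilon'. \qquad (12)
\end{aligned}$$

We bound the third term in a similar way. We have

$$\begin{aligned}
\mathbb{E}_{\mathbb{P}} \left\| \bar{\mathcal{T}}\big(P_A ((\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) - (\rho_{AR} - \rho^{\mathrm{cl}}_{AR})) P_A^\dagger\big) \right\|_1
&\leq \sum_{\alpha \in \{+,-\}} \mathbb{E}_{\mathbb{P}}\, \mathrm{tr}\big( \bar{\mathcal{T}}(P_A \Gamma_\alpha P_A^\dagger) \big) \\
&\leq \mathrm{tr}(\Gamma_+) + \mathrm{tr}(\Gamma_-) \leq 8\varepsilon. \qquad (13)
\end{aligned}$$

Substituting the expressions (12) and (13) into Inequality (11)
shows that

$$\mathbb{E}_{\mathbb{P}} \left\| \hat{\mathcal{T}}\big(P_A (\hat\rho_{AR} - \hat\rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1
\geq \mathbb{E}_{\mathbb{P}} \left\| \bar{\mathcal{T}}\big(P_A (\rho_{AR} - \rho^{\mathrm{cl}}_{AR}) P_A^\dagger\big) \right\|_1 - 8\varepsilon - 8\varepsilon'$$

and an application of Theorem 3.1 concludes the proof. ■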



IV. From dequantizing to coding 

In the following subsections we show how the dequantizing 
theorem (Theorem 3.5) yields a one-shot coding theorem for 
classical information. Then, in the last subsection, we apply 
this coding theorem to the iid scenario and obtain the HSW 
theorem as a corollary. 

A. Sending classical information through a quantum channel 
Consider the following scenario: Alice wants to send a
classical message M to Bob using a quantum channel $\mathcal{T}_{A \to B}$.
For this purpose she encodes her message using an encoding
TPCPM $\mathcal{E}_{M \to A}$ into the input A of the channel. Having received
the output of the channel B, Bob will apply a decoding
TPCPM $\mathcal{D}_{B \to \hat{M}}$ aiming to recover the original message. Since
we are interested in the transmission of classical data through
a quantum channel, the operation $\mathcal{E}$ can be assumed to be
classical, while $\mathcal{T}$ is a CQ-channel.

Alice's message M is assumed to exhibit perfect correlations
with some classical reference system R. This means that
the joint state of the message and the reference can be represented
by the operator $\varphi^c_{MR} = \sum_i \lambda_i |ii\rangle\langle ii|_{MR}$ for some
probability distribution $\lambda$. The aim is that after decoding, Bob
holds a system $\hat{M}$ which contains Bob's decoded message.
Naturally, we want $\hat{M}$ to contain the same message as M with
high probability, which is equivalent to saying that $\hat{M}$ is almost
perfectly correlated to the reference system R. (See Figure 3
for an illustration of this scenario.) Mathematically, this means
that we want the probability of error $p_e$ to be bounded as

$$2 p_e = \left\| (\mathcal{D}_{B \to \hat{M}} \circ \mathcal{T}_{A \to B} \circ \mathcal{E}_{M \to A})(\varphi^c_{MR}) - \varphi^c_{\hat{M}R} \right\|_1 \leq \varepsilon.$$

In other words, the state after encoding, the channel, and
decoding is within $\varepsilon$ in trace distance to a state that is perfectly
correlated between $\hat{M}$ and the copy of the message in R.
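
For classical (diagonal) states the relation between the error probability and the Schatten 1-norm can be made explicit with a toy example. In the sketch below, the message distribution `lam` and the conditional distribution `q` of Bob's output are made-up numbers chosen only for illustration; `q[j, i]` is the probability that Bob decodes j when the message was i.

```python
import numpy as np

lam = np.array([0.5, 0.3, 0.2])              # message distribution (illustrative)

# q[j, i]: probability that Bob outputs j given message i (columns sum to 1).
q = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.80, 0.10],
              [0.05, 0.10, 0.90]])

p_err = 1.0 - sum(lam[i] * q[i, i] for i in range(3))

# Joint distribution of (decoded message, reference) versus the ideal one.
actual = q * lam                             # actual[j, i] = lam_i * q(j | i)
ideal = np.diag(lam)                         # perfectly correlated state

# For diagonal operators the Schatten 1-norm is the sum of absolute differences.
one_norm = np.abs(actual - ideal).sum()
```

The 1-norm distance between the actual and the ideal joint state equals exactly twice the error probability, which is the equality used in the display above.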

B. The purified picture 

To apply the dequantizing theorem, it is necessary to work 
with pure states and operations. Hence, for our derivation 
we will consider the setup depicted in Figure 3, but where 
all states and operations are replaced with the corresponding 
purifications. 




Fig. 3: Diagram illustrating the transmission of classical data 
through a quantum channel. The state if c MR represents perfect 
classical correlations of a message M and a reference R. The 
aim is to obtain after decoding a system M with nearly perfect 
classical correlations to R. 



The present subsection shows how the dequantizing theorem 
(Theorem 3.1) can be used to show the existence of a decoding 
operation; in the following subsection, we will apply the 
theorem to get the encoder. 

First, we purify the state <£> C MR from above to \(p)mr = 
Z^j V / AiK«) mr- Next, we will assume that the encoder £ is 
actually a partial isometry Vm->a ; this is slightly less general, 
but it will turn out to be enough for our purposes. Then, 
we replace T by its Stinespring dilation Uj^ BXE , where 
X contains a copy of the classical input, as explained in the 
preliminaries (Section II). Likewise, the decoder V becomes 
the partial isometry U 1 ^ -~ , with an "environment" system 

B — > M Ex? 

Et>. See Figure 4 for an illustration of the purified picture. 

Now, we will show that if dequantizing holds, then a suitable 
decoder must exist. Consider the two states $\rho$ and $\hat\rho$ in 
Figure 4, which are the states immediately before and immediately 
after the decoder, and look at the reduced states on $R$, $E$ and 
$X$. Since these subsystems are untouched by the decoder, we 
have that $\rho_{RXE} = \hat\rho_{RXE}$, and $\rho_{RXEB}$ and 
$\hat\rho_{RXEME_D}$ are both purifications of this state. The decoder 
is then simply the partial isometry that relates them and which is 
guaranteed to exist by the unitary equivalence of purifications. 
Hence, as long as $\rho_{RXE}$ is of the right form, we know that a 
suitable decoder must exist. 

We must now find out what this "right form" is. Note that 
since the encoder is classical, and the channel is CQ, one can 
show that the state $\hat\rho_{RXEME_D}$ after the decoder must have 
the form 

\[ |\hat\rho\rangle_{RXEME_D} = \sum_i \sqrt{\lambda_i}\, |i\rangle_R \otimes |\pi(i)\rangle_X \otimes |\psi^i\rangle_{EME_D} \]

for some set of states $\{|\psi^i\rangle\}$ and some permutation $\pi$. 
Furthermore, we know that the error probability must be low; this 
means that $|\hat\rho\rangle$ must be close to a state of the form 

\[ |\xi\rangle_{RXEME_D} = \sum_i \sqrt{\lambda_i}\, |i\rangle_R \otimes |\pi(i)\rangle_X \otimes |i\rangle_M \otimes |\theta^i\rangle_{EE_D}, \]

where here the decoder output $M$ is perfectly correlated with 
the message in $R$ (and $\{|\theta^i\rangle\}$ is some set of states). 
Tracing out $ME_D$ in $\xi$, we get 

\[ \xi_{RXE} = \sum_i \lambda_i\, |i\pi(i)\rangle\langle i\pi(i)|_{RX} \otimes \theta^i_E . \]

Note that this $\xi$ has only classical correlations between the 
three systems; this is the "right form" that we need for the 
channel output. 

Fig. 4: Diagram illustrating the completely purified scenario. 
Since $\mathcal{T}$ is a CQ-channel the corresponding environment can 
be split up in two parts $X$ and $E$. 



Hence, we have reduced the problem to finding an encoder 
that ensures that the output of the channel has almost only 
classical correlations, and this is precisely what the dequan- 
tizing theorem does. 

C. A one-shot classical coding theorem 

We now put the pieces together and derive a coding the- 
orem based on the argument in the previous subsection. We 
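need to obtain an encoding operation $\mathcal{E}$ with the property that 

\[ \mathrm{tr}_B\big(\mathcal{U}^{\mathcal{T}} \circ \mathcal{E}(\varphi_{MR})\big) \approx \sum_i \lambda_i\, |i\pi(i)\rangle\langle i\pi(i)|_{RX} \otimes \theta^i_E , \]

where $\mathcal{U}^{\mathcal{T}}(\cdot) = U^{\mathcal{T}} (\cdot) (U^{\mathcal{T}})^\dagger$ denotes the 
Stinespring dilation of $\mathcal{T}$. On the other hand, note that by 
sending the classically correlated state $\varphi^c_{MR}$ through the 
channel (as opposed to $|\varphi\rangle_{MR}$), we automatically get a 
state of this form, regardless of the encoder. Writing 
$\bar{\mathcal{T}}_{A \to XE} := \mathrm{tr}_B \circ\, \mathcal{U}^{\mathcal{T}}$ for the channel to the 
environment, our strategy will therefore be to show that 

\[ \bar{\mathcal{T}}(\mathcal{E}(\varphi_{MR})) \approx \bar{\mathcal{T}}(\mathcal{E}(\varphi^c_{MR})). \tag{14} \]

We are now in a position to use the dequantizing theorem. 
The encoder is constructed as follows: we first embed the 
message $M$ into the input space of the channel $A$. (We could 
denote this using a partial isometry, but to avoid cluttering 
the notation we will simply consider $\mathcal{H}_M$ to be a subspace of 
$\mathcal{H}_A$ from now on.) We then apply a permutation on the basis 
elements of $A$, as in Theorem 3.5. We will then show that, 
if we average over the choice of permutations, this strategy 
works. It then follows that a suitable permutation exists. 

Applying Theorem 3.5 to the scenario at hand, we get that 
there is a permutation operator $P_A$ such that 

\[ \big\| \bar{\mathcal{T}}\big(P_A(\varphi_{MR} - \varphi^c_{MR})P_A^\dagger\big) \big\|_1 \le \sqrt{\frac{1}{d_A - 1}\, 2^{-H^\varepsilon_{\min}(A|EX)_\omega - H_{\min}(M|R)_\varphi}} + 8\varepsilon . \]

This gives precise bounds for (14) above. We now get the 
decoder using Corollary 2.2: there exists a TPCPM $\mathcal{D}_{B \to M}$ 
such that 

\[ \big\| \mathcal{D}\big(U^{\mathcal{T}} P_A \varphi_{MR} P_A^\dagger (U^{\mathcal{T}})^\dagger\big) - \xi_{RXEM} \big\|_1 \le 2\sqrt{\sqrt{\frac{1}{d_A - 1}\, 2^{-H^\varepsilon_{\min}(A|EX)_\omega - H_{\min}(M|R)_\varphi}} + 8\varepsilon} , \]

where $\xi_{RXEM}$ is the reduced state of the target state $|\xi\rangle$ 
from the previous subsection. Tracing out the systems $X$ and $E$, we get 

\[ \big\| \mathcal{D} \circ \mathcal{T}(P_A \varphi_{MR} P_A^\dagger) - \varphi^c_{MR} \big\|_1 \le 2\sqrt{\sqrt{\frac{1}{d_A - 1}\, 2^{-H^\varepsilon_{\min}(A|EX)_\omega - H_{\min}(M|R)_\varphi}} + 8\varepsilon} . \]

By the duality of the smooth min- and max-entropies [22], we have 
$-H_{\min}(M|R)_\varphi = H_{\max}(M)_\varphi$ and 
$-H^\varepsilon_{\min}(A|EX)_\omega = H^\varepsilon_{\max}(A|B)_\omega$, yielding the following 
theorem: 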

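Theorem 4.1: Let $\mathcal{T}_{A \to B}$ be a quantum channel with 
Choi-Jamiołkowski representation $\omega_{AB}$, and let 
$|\varphi\rangle_{MR} = \sum_i \sqrt{\lambda_i}\,|ii\rangle_{MR}$, where $\lambda_i$ is a 
probability distribution. Then, we have that there exists a 
permutation $P_A$ on the basis elements of $A$ such that, for a 
suitable decoding TPCPM $\mathcal{D}_{B \to M}$, 

\[ \big\| \mathcal{D} \circ \mathcal{T}(P_A \varphi_{MR} P_A^\dagger) - \varphi^c_{MR} \big\|_1 \le 2\sqrt{\sqrt{\frac{1}{d_A - 1}\, 2^{H^\varepsilon_{\max}(A|B)_\omega + H_{\max}(M)_\varphi}} + 8\varepsilon} . \tag{15} \]
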
Hence, if the right-hand side of the above inequality is small 
enough, then the scheme succeeds. We formulate this fact as 
the following corollary of the preceding theorem: 

Corollary 4.2: Let $\mathcal{T}_{A \to B}$ be a quantum channel with 
Choi-Jamiołkowski representation $\omega_{AB}$, and let $\mathcal{M}$ be a set of 
messages with $|\mathcal{M}| \le d_A$ and $p$ a probability distribution on 
$\mathcal{M}$. Then, there exists an encoder and a decoder for $\mathcal{T}_{A \to B}$ 
with error probability $p_e > 0$ if 

\[ H_{\max}(M)_p \le \log d_A - H^\varepsilon_{\max}(A|B)_\omega - 1 + 2\log\big(p_e^2 - 8\varepsilon\big) \]

for some constant $\varepsilon$, $0 \le \varepsilon < p_e^2/8$. 

Proof: The above inequality ensures that the right-hand 
side of (15) is at most $2p_e$. ■ 
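
The step in this proof can be spelled out as follows. This is only a sketch: it assumes $d_A \ge 2$ and that the right-hand side of (15) has the form $2\sqrt{\sqrt{2^{H^\varepsilon_{\max}(A|B)_\omega + H_{\max}(M)}/(d_A - 1)} + 8\varepsilon}$, as the derivation preceding Theorem 4.1 suggests.

```latex
\begin{align*}
H_{\max}(M)_p
  &\le \log d_A - 1 - H^\varepsilon_{\max}(A|B)_\omega + 2\log\big(p_e^2 - 8\varepsilon\big) \\
\Longrightarrow\quad
2^{H^\varepsilon_{\max}(A|B)_\omega + H_{\max}(M)_p}
  &\le \tfrac{d_A}{2}\,\big(p_e^2 - 8\varepsilon\big)^2
   \le (d_A - 1)\,\big(p_e^2 - 8\varepsilon\big)^2
   && (d_A \ge 2) \\
\Longrightarrow\quad
\sqrt{\tfrac{1}{d_A - 1}\, 2^{H^\varepsilon_{\max}(A|B)_\omega + H_{\max}(M)_p}} + 8\varepsilon
  &\le p_e^2
   && (p_e^2 - 8\varepsilon > 0) \\
\Longrightarrow\quad
2\sqrt{\sqrt{\tfrac{1}{d_A - 1}\, 2^{H^\varepsilon_{\max}(A|B)_\omega + H_{\max}(M)_p}} + 8\varepsilon}
  &\le 2 p_e .
\end{align*}
```

The constant $-1$ in the corollary thus absorbs the replacement of $\log(d_A - 1)$ by $\log d_A$, and the condition $\varepsilon < p_e^2/8$ keeps the logarithm well defined.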

D. The i.i.d. scenario and the HSW theorem 

Applying the above to a channel of the form $\mathcal{T}^{\otimes n}_{A \to B}$ allows 
us to easily recover the HSW theorem. Recall that the HSW 
theorem states that there exists a family of codes for $\mathcal{T}$ with 
increasing block length $n$ with a vanishing error probability as 
$n \to \infty$ as long as the rate $Q$ is less than $I(X;B)_\tau$, where $\tau$ 
is a state of the form $\tau_{XB} = \sum_x p_x\, |x\rangle\langle x|_X \otimes \mathcal{T}(\sigma^x_A)$, where 
$p_x$ forms a probability distribution. (The rate of a code for 
$n$ uses of a channel is $\frac{1}{n}\log K$, where $K$ is the number of 
possible messages sent.) The only challenge facing us when 
attempting to prove this is to relate the quantity $\log d_A$ to 
$nH(X)_\tau$. We do this using the idea of types, explained very 
briefly in Appendix B. The result is the following theorem: 

Theorem 4.3 (Holevo [11], Schumacher-Westmoreland [18]): 
Let $\mathcal{T}_{A \to B}$ be a CQ channel, let $q$ be a probability distribution 
over the set $\mathcal{X}$, and let $\tau_{XB} := \sum_{x \in \mathcal{X}} q(x)\, |x\rangle\langle x|_X \otimes \mathcal{T}(\sigma^x_A)$, 
where $\{\sigma^x_A : x \in \mathcal{X}\}$ is a set of states on $A$. Then, there exists 
a family of codes for $\mathcal{T}^{\otimes n}$ whose rate approaches $I(X;B)_\tau$. 
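
Proof: Let $p \in \mathcal{P}_n(\mathcal{X})$ be the most likely type under the 
distribution $q^n$, and let $A'$ be a system with 
$\mathcal{H}_{A'} \subseteq \mathcal{H}_X^{\otimes n}$ of dimension $|t(p)|$ defined as 

\[ \mathcal{H}_{A'} := \mathrm{span}\{ |\vec{x}\rangle_{X^n} : \vec{x} \in t(p) \}. \]

Now, consider the channel $\mathcal{T}'_{A' \to B^n}$ defined as follows: 

\[ \mathcal{T}'(\,\cdot\,) = \sum_{\vec{x} \in t(p)} \langle \vec{x} | \cdot | \vec{x} \rangle\, \mathcal{T}^{\otimes n}(\sigma_{\vec{x}}), \]

where $\sigma_{\vec{x}} = \sigma^{x_1} \otimes \cdots \otimes \sigma^{x_n}$. Furthermore, let 
$\omega_{A'B^n}$ be the Choi-Jamiołkowski state of $\mathcal{T}'$, and let us apply 
Corollary 4.2 to the uniform distribution over some message set 
$\mathcal{M}$ and channel $\mathcal{T}'$. We get that there exists an encoder and a 
decoder whenever 

\[ \log|\mathcal{M}| \le \log d_{A'} - H^\varepsilon_{\max}(A'|B^n)_\omega - o(n) = \log|t(p)| - H^\varepsilon_{\max}(A'|B^n)_\omega - o(n). \]

From the properties of types, we have that 
$|t(p)| \ge |\mathcal{P}_n(\mathcal{X})|^{-1}\, 2^{nH(p)}$. Also, note that 

\[ \omega_{A'B^n} = \Pi_{t(p)}\, \tau_{XB}^{\otimes n}\, \Pi_{t(p)} \,/\, \mathrm{tr}\big[\Pi_{t(p)}\, \tau_{XB}^{\otimes n}\big]. \]

Using Lemma A.2, we get that 

\[ H^\varepsilon_{\max}(A'|B^n)_\omega \le H^\varepsilon_{\max}(X^n|B^n)_{\tau^{\otimes n}} + \log \mathrm{tr}\big[\Pi_{t(p)}\, \tau_{XB}^{\otimes n}\big]. \]

Hence, we now have that 

\[ \log|\mathcal{M}| \le \log|t(p)| - H^\varepsilon_{\max}(X^n|B^n)_{\tau^{\otimes n}} - \log \mathrm{tr}\big[\Pi_{t(p)}\, \tau_{XB}^{\otimes n}\big] - o(n). \]

Choosing $\log|\mathcal{M}| = nQ$ for a transmission rate of $Q$ and 
using the above bound on $|t(p)|$ and the fact that the most 
likely type $p$ satisfies $\mathrm{tr}[\Pi_{t(p)}\, \tau_{XB}^{\otimes n}] \ge |\mathcal{P}_n(\mathcal{X})|^{-1}$, this bound 
becomes: 

\[ Q \le H(X)_\tau - \frac{1}{n} H^\varepsilon_{\max}(X^n|B^n)_{\tau^{\otimes n}} - o(1). \]

Taking the limit as $n \to \infty$, we get that this bound is 
satisfied whenever 

\[ Q < H(X)_\tau - H(X|B)_\tau = I(X;B)_\tau , \]

where we have used the fully quantum asymptotic equipartition 
property of [21] to bound the $H_{\max}$ term above. Since 
this is true for any $\varepsilon > 0$, the theorem holds. ■ 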
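
To make the target rate $I(X;B)_\tau$ concrete, here is a minimal sketch that evaluates it for a CQ state whose channel outputs are qubit states. It relies on two standard facts: for a CQ state, $I(X;B)_\tau = H(B)_\tau - \sum_x q(x) H(\mathcal{T}(\sigma^x))$, and a qubit state with Bloch vector $r$ has entropy $h((1+|r|)/2)$, where $h$ is the binary entropy. The function names and the Bloch-vector parametrization are illustrative assumptions, not part of the paper.

```python
from math import log2, sqrt


def binary_entropy(t):
    """Binary entropy h(t) in bits, with h(0) = h(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * log2(t) - (1.0 - t) * log2(1.0 - t)


def qubit_entropy(bloch):
    """Von Neumann entropy of a qubit state given by its Bloch vector."""
    # Clamp the length to 1 to guard against rounding on pure states.
    r = min(1.0, sqrt(sum(c * c for c in bloch)))
    return binary_entropy((1.0 + r) / 2.0)


def holevo_information(q, bloch_vectors):
    """I(X;B) for the CQ state sum_x q(x) |x><x| (x) rho_x (hypothetical helper).

    q             -- input probabilities q(x)
    bloch_vectors -- Bloch vector of each channel output rho_x
    """
    # Bloch vector of the average output state is the average Bloch vector.
    avg = [sum(qx * v[k] for qx, v in zip(q, bloch_vectors)) for k in range(3)]
    return qubit_entropy(avg) - sum(
        qx * qubit_entropy(v) for qx, v in zip(q, bloch_vectors)
    )


# Outputs |0> and |+> with equal probability: non-orthogonal pure states.
rate = holevo_information([0.5, 0.5], [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)])
```

Orthogonal outputs give one full bit, identical outputs give zero, and the non-orthogonal pair above falls strictly in between.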

V. Conclusion and further work 

In this article, we show that it is possible to derive direct 
bounds for the capacity of classical-quantum channels using 
decoupling-like techniques, thereby adding the transmission 
of classical data to the list of problems that are amenable to the 
decoupling approach to coding. Our derivation also naturally 
leads to bounds in the one-shot setting, where the channel is 
only used once and we allow a finite error probability. 

We want to emphasize, however, that the bounds resulting 
from our calculation are somewhat weaker than the best known 
one-shot direct bounds, found for example in Mosonyi and 
Datta [14] and Wang and Renner [24]. Furthermore, our one-shot 
result only applies to uniform inputs of the channel, 
and the method of types is needed to shape the input into 
this form in order to achieve the HSW capacity. The latter 
weakness could potentially be overcome inside the decoupling 
framework, using an analogue of Theorem 3.14 in [7]. 

Appendix A 

Technical facts about the smooth entropies 

Here we establish some useful properties of the (smooth) 
min-entropy of classically coherent states. In particular, for a 
state $\rho_{XX'AB}$ that is classically coherent between $X$ and $X'$, 
we show that the state $\sigma_{X'B}$ that optimizes 

\[ H_{\min}(XA|X'B)_\rho = \max_{\sigma_{X'B}} \sup\big\{ \lambda \in \mathbb{R} : \rho_{XX'AB} \le 2^{-\lambda}\, \mathbb{1}_{XA} \otimes \sigma_{X'B} \big\} \]

can be chosen to have CQ-structure. Furthermore, we show 
how the min-entropy of a classically coherent state behaves 
under smoothing. The following lemma is a direct consequence 
of the results obtained in [22]. 

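Lemma A.1: Let $\rho_{XX'AB}$ be coherent classical on $X$ and 
$X'$. Then, there exists a state $\bar\rho_{XX'AB} \in \mathcal{B}^\varepsilon(\rho_{XX'AB})$ that is 
coherent classical on $X$ and $X'$ and a state $\sigma_{X'B} \in \mathcal{S}_\le(\mathcal{H}_{X'B})$ 
that is classical on $X'$ such that 

\[ H^\varepsilon_{\min}(XA|X'B)_\rho = \sup\big\{ \lambda \in \mathbb{R} : \bar\rho_{XX'AB} \le 2^{-\lambda}\, \mathbb{1}_{XA} \otimes \sigma_{X'B} \big\}. \]

Proof: Let $\tilde\rho_{XX'AB} \in \mathcal{B}^\varepsilon(\rho_{XX'AB})$ be a state 
that maximizes the smooth min-entropy, namely it satisfies 
$H^\varepsilon_{\min}(XA|X'B)_\rho = H_{\min}(XA|X'B)_{\tilde\rho}$. Then, the state 
$\bar\rho_{XX'AB} = P_{XX'}\, \tilde\rho_{XX'AB}\, P_{XX'}$ satisfies the criteria. 

First, note that we have 
$P(\bar\rho_{XX'AB}, \rho_{XX'AB}) \le P(\tilde\rho_{XX'AB}, \rho_{XX'AB}) \le \varepsilon$ due to the 
monotonicity of the purified distance under projections [22]. 
Second, by definition of the smooth min-entropy, there exists an 
operator $\tilde\sigma_{X'B}$ such that, for $\lambda = H^\varepsilon_{\min}(XA|X'B)_\rho$, we have 

\[ \tilde\rho_{XX'AB} \le 2^{-\lambda}\, \mathbb{1}_{XA} \otimes \tilde\sigma_{X'B} . \]

Thus, 

\begin{align*}
\bar\rho_{XX'AB} &\le 2^{-\lambda}\, P_{XX'} \big( \mathbb{1}_{XA} \otimes \tilde\sigma_{X'B} \big) P_{XX'} \\
&= 2^{-\lambda} \sum_x |x\rangle\langle x|_X \otimes \mathbb{1}_A \otimes |x\rangle\langle x|_{X'} \otimes \langle x|\tilde\sigma_{X'B}|x\rangle \\
&\le 2^{-\lambda}\, \mathbb{1}_{XA} \otimes \underbrace{\sum_x |x\rangle\langle x|_{X'} \otimes \langle x|\tilde\sigma_{X'B}|x\rangle}_{=:\, \sigma_{X'B}} . \tag{16}
\end{align*}

Finally, we note that $\mathrm{tr}(\sigma_{X'B}) \le 1$ and, thus, Eq. (16) implies 
that $H^\varepsilon_{\min}(XA|X'B)_\rho \ge \lambda$, which concludes the proof. ■ 
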
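Lemma A.2: Let $\rho_{AB} \in \mathcal{S}_\le(\mathcal{H}_{AB})$, let $\Pi_A$ be an operator 
such that $0 \le \Pi_A \le \mathbb{1}_A$, and let $\varepsilon \ge 0$. Furthermore, let 
$\rho'_{AB} := \Pi_A\, \rho_{AB}\, \Pi_A$. Then, $H^\varepsilon_{\max}(A|B)_{\rho'} \le H^\varepsilon_{\max}(A|B)_\rho$. 

Proof: Let $\tilde\rho_{AB} \in \mathcal{B}^\varepsilon(\rho)$ and $\sigma_B$ be such that 

\[ 2^{H^\varepsilon_{\max}(A|B)_\rho} = F(\tilde\rho_{AB}, \mathbb{1}_A \otimes \sigma_B)^2 . \]

Let $\tilde\rho' := \Pi \tilde\rho\, \Pi \in \mathcal{B}^\varepsilon(\rho')$, and let $\omega_B$ be such that 

\[ 2^{H_{\max}(A|B)_{\tilde\rho'}} = F(\tilde\rho'_{AB}, \mathbb{1}_A \otimes \omega_B)^2 . \]

Then, we immediately have that 

\begin{align*}
2^{H^\varepsilon_{\max}(A|B)_\rho} &= F(\tilde\rho_{AB}, \mathbb{1}_A \otimes \sigma_B)^2 \\
&\ge F(\tilde\rho_{AB}, \mathbb{1}_A \otimes \omega_B)^2 \\
&= \Big( \mathrm{tr} \sqrt{ (\mathbb{1}_A \otimes \sqrt{\omega_B})\, \tilde\rho_{AB}\, (\mathbb{1}_A \otimes \sqrt{\omega_B}) } \Big)^2 \\
&\ge \Big( \mathrm{tr} \sqrt{ (\mathbb{1}_A \otimes \sqrt{\omega_B})\, \Pi_A \tilde\rho_{AB} \Pi_A\, (\mathbb{1}_A \otimes \sqrt{\omega_B}) } \Big)^2 \\
&= F(\Pi_A \tilde\rho_{AB} \Pi_A, \mathbb{1}_A \otimes \omega_B)^2 \\
&= 2^{H_{\max}(A|B)_{\tilde\rho'}} \\
&\ge 2^{H^\varepsilon_{\max}(A|B)_{\rho'}} .
\end{align*}

Taking logarithms then yields the lemma. ■ 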

Appendix B 
The method of types 

The "method of types" is a technique that is used extensively 
in classical information theory and that we need here to 
make statements about discrete memoryless channels. For a 
complete introduction to this method, we refer the reader to 
[4]; we will only give here the facts needed for our paper. 
The basic idea goes as follows. Let $\mathcal{X}$ be a finite set, and let 
$\vec{x} = x_1 x_2 \cdots x_n \in \mathcal{X}^n$ be a sequence of $n$ symbols from $\mathcal{X}$. For 
any $x \in \mathcal{X}$, let $p_{\vec{x}}(x)$ be the relative frequency of the symbol 
$x$ in $\vec{x}$ (i.e. the number of occurrences of $x$ in $\vec{x}$ divided by $n$). 
We call the distribution $p_{\vec{x}}$ the type of $\vec{x}$, and, given a type 
$p$, we define $t(p)$ to be the set of all sequences of type $p$. 
Furthermore, we define $\mathcal{P}_n(\mathcal{X})$ to be the set of all types over 
$\mathcal{X}^n$. 

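We now list some basic properties of types: 

• $|\mathcal{P}_n(\mathcal{X})| = \binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1}$. 

• For any type $p \in \mathcal{P}_n(\mathcal{X})$, we have that 

\[ |\mathcal{P}_n(\mathcal{X})|^{-1}\, 2^{nH(p)} \le |t(p)| \le 2^{nH(p)} . \]

• For any type $p \in \mathcal{P}_n(\mathcal{X})$ and any probability distribution 
$q$ over $\mathcal{X}$, we have that 

\[ |\mathcal{P}_n(\mathcal{X})|^{-1}\, 2^{-nD(p\|q)} \le \sum_{\vec{x} \in t(p)} q^n(\vec{x}) \le 2^{-nD(p\|q)} . \]

• For any probability distribution $q$ and any $n$, the most 
likely type $p$ has total probability 

\[ \sum_{\vec{x} \in t(p)} q^n(\vec{x}) \ge |\mathcal{P}_n(\mathcal{X})|^{-1} . \]

Note that $|\mathcal{P}_n(\mathcal{X})|$ is polynomial in $n$ and becomes negligible 
in most expressions involving exponentials of entropies. 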
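
The properties listed above can be checked by brute force for small alphabets and block lengths. The following sketch enumerates all sequences, groups them by type, and asserts each bound numerically. The function names are illustrative, `q` is assumed to be strictly positive, and the small tolerance only absorbs floating-point rounding.

```python
from collections import Counter
from itertools import product
from math import comb, log2


def type_classes(alphabet, n):
    """Group all length-n sequences over `alphabet` by type (symbol counts)."""
    classes = {}
    for seq in product(alphabet, repeat=n):
        counts = Counter(seq)
        key = tuple(counts[a] for a in alphabet)
        classes.setdefault(key, []).append(seq)
    return classes


def entropy(counts, n):
    """Shannon entropy H(p) in bits of the type p = counts / n."""
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def relative_entropy(counts, n, q):
    """D(p||q) in bits for the type p = counts / n (q strictly positive)."""
    return sum(
        (c / n) * log2((c / n) / q[i]) for i, c in enumerate(counts) if c > 0
    )


def check_type_properties(alphabet, n, q, tol=1e-9):
    """Assert the four listed properties; return (#types, best class probability)."""
    classes = type_classes(alphabet, n)
    num_types = len(classes)
    # Property 1: number of types.
    assert num_types == comb(n + len(alphabet) - 1, len(alphabet) - 1)
    best = 0.0
    for counts, seqs in classes.items():
        size = len(seqs)
        # Property 2: size of a type class.
        upper = 2.0 ** (n * entropy(counts, n))
        assert upper / num_types <= size * (1 + tol)
        assert size <= upper * (1 + tol)
        # q^n is constant on a type class, so one sequence suffices.
        prob_one = 1.0
        for qi, c in zip(q, counts):
            prob_one *= qi ** c
        prob = size * prob_one
        # Property 3: probability of a type class under q^n.
        bound = 2.0 ** (-n * relative_entropy(counts, n, q))
        assert bound / num_types <= prob * (1 + tol)
        assert prob <= bound * (1 + tol)
        best = max(best, prob)
    # Property 4: the most likely type class.
    assert best >= 1.0 / num_types - tol
    return num_types, best


num_types, best = check_type_properties("abc", 6, [0.5, 0.3, 0.2])
```

The enumeration is exponential in $n$, so this is only meant as a sanity check for small cases, not as a way to compute with types.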

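To use these concepts in quantum information, we will 
define type projectors. Let $X$ be a $|\mathcal{X}|$-dimensional quantum 
system, with a basis vector $|x\rangle$ for each $x \in \mathcal{X}$. Let $p \in \mathcal{P}_n(\mathcal{X})$; 
we define the type projector $\Pi_{t(p)}$ as 

\[ \Pi_{t(p)} := \sum_{\vec{x} \in t(p)} |\vec{x}\rangle\langle\vec{x}| , \]

where $|\vec{x}\rangle = |x_1\rangle \otimes \cdots \otimes |x_n\rangle$ for $\vec{x} = x_1 \ldots x_n$. 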
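
Since a type projector is diagonal in the product basis, it can be represented by its 0/1 diagonal. The sketch below (with illustrative function names, and the basis ordered as `itertools.product` enumerates sequences) builds that diagonal, so that its rank can be compared with $|t(p)|$ and the projectors of all types can be seen to sum to the identity.

```python
from itertools import product
from math import factorial


def type_projector_diagonal(alphabet, n, counts):
    """Diagonal of the type projector in the basis |x_1> (x) ... (x) |x_n>.

    `counts[i]` is the number of occurrences of alphabet[i] that defines the
    type; the entry is 1 exactly for the sequences of that type.
    """
    target = tuple(counts)
    return [
        1 if tuple(seq.count(a) for a in alphabet) == target else 0
        for seq in product(alphabet, repeat=n)
    ]


def type_class_size(counts):
    """|t(p)| as the multinomial coefficient n! / (c_1! ... c_k!)."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)  # each division is exact
    return size


diag = type_projector_diagonal("ab", 4, (2, 2))
rank = sum(diag)  # rank of the projector = number of sequences of this type
```

Summing the diagonals over all types $(k, 4-k)$ for $k = 0, \ldots, 4$ gives the all-ones vector, reflecting that the type classes partition $\mathcal{X}^n$.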